Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 428 | 431 |
| Missing cells (%) | 8.0% | 8.1% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
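The percentage and per-record figures in this table can be rederived from the raw counts. The sketch below assumes that "Missing cells (%)" is the missing count divided by variables × observations, and that the average record size is the total size divided by the number of observations. Both assumptions reproduce the reported values, but they are not taken from the tool's documentation.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 446
missing_cells = {"Dataset A": 428, "Dataset B": 431}
total_size_kib = 45.3

# Assumed denominator: every cell in the table (variables x observations).
total_cells = n_variables * n_observations

missing_pct = {
    name: round(100 * count / total_cells, 1)
    for name, count in missing_cells.items()
}

# Assumed definition: total size (KiB converted to bytes) per observation.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct)
print(avg_record_bytes)
```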
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Age has 84 (18.8%) missing values | Age has 87 (19.5%) missing values | Missing |
| Cabin has 343 (76.9%) missing values | Cabin has 344 (77.1%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 306 (68.6%) zeros | SibSp has 293 (65.7%) zeros | Zeros |
| Parch has 337 (75.6%) zeros | Parch has 339 (76.0%) zeros | Zeros |
| Fare has 9 (2.0%) zeros | Fare has 7 (1.6%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-03-18 18:33:39.300534 | 2024-03-18 18:33:43.302154 |
| Analysis finished | 2024-03-18 18:33:43.301050 | 2024-03-18 18:33:46.300350 |
| Duration | 4 seconds | 3 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
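The "Duration" row follows from the start and finish timestamps. A minimal sketch, assuming the reported duration is the elapsed time rounded to the nearest second (the rounding rule is an inference from the two rows, not a documented behaviour):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (started, finished) timestamps copied from the "Reproduction" table.
runs = {
    "Dataset A": ("2024-03-18 18:33:39.300534", "2024-03-18 18:33:43.301050"),
    "Dataset B": ("2024-03-18 18:33:43.302154", "2024-03-18 18:33:46.300350"),
}

durations = {}
for name, (started, finished) in runs.items():
    elapsed = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    durations[name] = elapsed.total_seconds()

for name, seconds in durations.items():
    print(f"{name}: {seconds:.6f} s, about {round(seconds)} seconds")
```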
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 429.80717 | 435.92152 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 2 |
| Maximum | 891 | 887 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 2 |
| 5-th percentile | 42.5 | 47.25 |
| Q1 | 208.25 | 217.5 |
| median | 413 | 432.5 |
| Q3 | 648.75 | 667.25 |
| 95-th percentile | 841.75 | 830 |
| Maximum | 891 | 887 |
| Range | 890 | 885 |
| Interquartile range (IQR) | 440.5 | 449.75 |
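Two rows of the quantile table are derived from the others: the range is maximum minus minimum, and the interquartile range is Q3 minus Q1. The short check below recomputes both from the PassengerId values listed above; the dictionary layout is purely illustrative.

```python
# Quantile values for PassengerId, copied from the table.
quantiles = {
    "Dataset A": {"min": 1, "q1": 208.25, "q3": 648.75, "max": 891},
    "Dataset B": {"min": 2, "q1": 217.5, "q3": 667.25, "max": 887},
}

derived = {
    name: {
        "range": q["max"] - q["min"],  # spread of the full data
        "iqr": q["q3"] - q["q1"],      # spread of the middle 50%
    }
    for name, q in quantiles.items()
}

print(derived)
```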
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 256.13708 | 256.90053 |
| Coefficient of variation (CV) | 0.59593487 | 0.58932748 |
| Kurtosis | -1.1850064 | -1.2449463 |
| Mean | 429.80717 | 435.92152 |
| Median Absolute Deviation (MAD) | 221 | 222.5 |
| Skewness | 0.1154245 | 0.032879239 |
| Sum | 191694 | 194421 |
| Variance | 65606.205 | 65997.884 |
| Monotonicity | Not monotonic | Not monotonic |
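Several descriptive statistics are tied together by simple identities: mean = sum / count, coefficient of variation = standard deviation / mean, and variance = standard deviation squared. The sketch below checks those identities against the PassengerId values; small differences are expected because the table shows rounded figures.

```python
import math

# Sum, observation count and standard deviation copied from the tables.
stats = {
    "Dataset A": {"sum": 191694, "n": 446, "std": 256.13708},
    "Dataset B": {"sum": 194421, "n": 446, "std": 256.90053},
}

results = {}
for name, s in stats.items():
    mean = s["sum"] / s["n"]
    results[name] = {
        "mean": mean,
        "cv": s["std"] / mean,       # coefficient of variation
        "variance": s["std"] ** 2,   # variance from the rounded std
    }

for name, r in results.items():
    print(name, round(r["mean"], 5), round(r["cv"], 6), round(r["variance"], 2))
```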
| Value | Count | Frequency (%) |
| 278 | 1 | 0.2% |
| 187 | 1 | 0.2% |
| 1 | 1 | 0.2% |
| 331 | 1 | 0.2% |
| 833 | 1 | 0.2% |
| 103 | 1 | 0.2% |
| 448 | 1 | 0.2% |
| 220 | 1 | 0.2% |
| 784 | 1 | 0.2% |
| 641 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 51 | 1 | 0.2% |
| 823 | 1 | 0.2% |
| 875 | 1 | 0.2% |
| 365 | 1 | 0.2% |
| 797 | 1 | 0.2% |
| 110 | 1 | 0.2% |
| 564 | 1 | 0.2% |
| 293 | 1 | 0.2% |
| 521 | 1 | 0.2% |
| 411 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 6 | 1 | |
| 9 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 15 | 1 | |
| 16 | 1 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 7 | 1 | |
| 9 | 1 | |
| 11 | 1 | |
| 13 | 1 | |
| 16 | 1 | |
| 17 | 1 | |
| 18 | 1 | |
| 20 | 1 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 7 | 1 | |
| 9 | 1 | |
| 11 | 1 | |
| 13 | 1 | |
| 16 | 1 | |
| 17 | 1 | |
| 18 | 1 | |
| 20 | 1 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 6 | 1 | |
| 9 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 15 | 1 | |
| 16 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 0 |
| 2nd row | 0 | 1 |
| 3rd row | 1 | 1 |
| 4th row | 0 | 1 |
| 5th row | 0 | 1 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 282 | 63.2% |
| 1 | 164 | 36.8% |

Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 268 | 60.1% |
| 1 | 178 | 39.9% |
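The "Frequency (%)" column can be recomputed from the counts. The sketch below assumes the denominator is the total number of observations (446), which matches the percentages that the report does print for other variables (for example, 93 of 446 is 20.9% for Pclass).

```python
n_observations = 446

# Counts for Survived, copied from the Common Values tables.
survived_counts = {
    "Dataset A": {"0": 282, "1": 164},
    "Dataset B": {"0": 268, "1": 178},
}

frequency_pct = {
    name: {
        value: round(100 * count / n_observations, 1)
        for value, count in counts.items()
    }
    for name, counts in survived_counts.items()
}

print(frequency_pct)
```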
[Plots not reproduced: "Length" and "Common Values (Plot)", the latter with panels for Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| 0 | 282 | |
| 1 | 164 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 282 | |
| 1 | 164 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 282 | |
| 1 | 164 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 282 | |
| 1 | 164 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 282 | |
| 1 | 164 |
| Value | Count | Frequency (%) |
| 0 | 268 | |
| 1 | 178 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 2 | 3 |
| 2nd row | 2 | 3 |
| 3rd row | 3 | 3 |
| 4th row | 3 | 3 |
| 5th row | 3 | 3 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 243 | 54.5% |
| 1 | 110 | 24.7% |
| 2 | 93 | 20.9% |

Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 245 | 54.9% |
| 1 | 105 | 23.5% |
| 2 | 96 | 21.5% |
[Plots not reproduced: "Length" and "Common Values (Plot)", the latter with panels for Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 110 | |
| 2 | 93 | 20.9% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 105 | |
| 2 | 96 | 21.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 110 | |
| 2 | 93 | 20.9% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 105 | |
| 2 | 96 | 21.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 110 | |
| 2 | 93 | 20.9% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 105 | |
| 2 | 96 | 21.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 110 | |
| 2 | 93 | 20.9% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 105 | |
| 2 | 96 | 21.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 243 | |
| 1 | 110 | |
| 2 | 93 | 20.9% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 105 | |
| 2 | 96 | 21.5% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 82 |
| Median length | 48 | 49 |
| Mean length | 27.029148 | 26.921525 |
| Min length | 13 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 12055 | 12007 |
| Distinct characters | 59 | 60 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Parkes, Mr. Francis "Frank" | Panula, Master. Juha Niilo |
| 2nd row | Hold, Mr. Stephen | Madsen, Mr. Fridtjof Arne |
| 3rd row | Moor, Mrs. (Beila) | Stranden, Mr. Juho |
| 4th row | Coleff, Mr. Satio | Peter, Mrs. Catherine (Catherine Rizk) |
| 5th row | Flynn, Mr. James | Mannion, Miss. Margareth |
| Value | Count | Frequency (%) |
| mr | 265 | 14.5% |
| miss | 90 | 4.9% |
| mrs | 57 | 3.1% |
| william | 33 | 1.8% |
| john | 25 | 1.4% |
| master | 23 | 1.3% |
| henry | 22 | 1.2% |
| george | 15 | 0.8% |
| james | 15 | 0.8% |
| thomas | 12 | 0.7% |
| Other values (886) | 1265 |
| Value | Count | Frequency (%) |
| mr | 257 | 14.1% |
| miss | 94 | 5.2% |
| mrs | 68 | 3.7% |
| william | 34 | 1.9% |
| john | 24 | 1.3% |
| master | 18 | 1.0% |
| george | 13 | 0.7% |
| edward | 12 | 0.7% |
| charles | 12 | 0.7% |
| henry | 11 | 0.6% |
| Other values (916) | 1276 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1377 | 11.4% |
| r | 1009 | 8.4% |
| e | 883 | 7.3% |
| a | 799 | 6.6% |
| i | 651 | 5.4% |
| n | 641 | 5.3% |
| s | 638 | 5.3% |
| M | 560 | 4.6% |
| l | 525 | 4.4% |
| o | 523 | 4.3% |
| Other values (49) | 4449 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.5% |
| r | 949 | 7.9% |
| e | 855 | 7.1% |
| a | 813 | 6.8% |
| i | 662 | 5.5% |
| n | 650 | 5.4% |
| s | 641 | 5.3% |
| M | 559 | 4.7% |
| l | 539 | 4.5% |
| o | 520 | 4.3% |
| Other values (50) | 4444 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 12055 |
| Value | Count | Frequency (%) |
| (unknown) | 12007 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1377 | 11.4% |
| r | 1009 | 8.4% |
| e | 883 | 7.3% |
| a | 799 | 6.6% |
| i | 651 | 5.4% |
| n | 641 | 5.3% |
| s | 638 | 5.3% |
| M | 560 | 4.6% |
| l | 525 | 4.4% |
| o | 523 | 4.3% |
| Other values (49) | 4449 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.5% |
| r | 949 | 7.9% |
| e | 855 | 7.1% |
| a | 813 | 6.8% |
| i | 662 | 5.5% |
| n | 650 | 5.4% |
| s | 641 | 5.3% |
| M | 559 | 4.7% |
| l | 539 | 4.5% |
| o | 520 | 4.3% |
| Other values (50) | 4444 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 12055 |
| Value | Count | Frequency (%) |
| (unknown) | 12007 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1377 | 11.4% |
| r | 1009 | 8.4% |
| e | 883 | 7.3% |
| a | 799 | 6.6% |
| i | 651 | 5.4% |
| n | 641 | 5.3% |
| s | 638 | 5.3% |
| M | 560 | 4.6% |
| l | 525 | 4.4% |
| o | 523 | 4.3% |
| Other values (49) | 4449 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.5% |
| r | 949 | 7.9% |
| e | 855 | 7.1% |
| a | 813 | 6.8% |
| i | 662 | 5.5% |
| n | 650 | 5.4% |
| s | 641 | 5.3% |
| M | 559 | 4.7% |
| l | 539 | 4.5% |
| o | 520 | 4.3% |
| Other values (50) | 4444 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 12055 |
| Value | Count | Frequency (%) |
| (unknown) | 12007 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1377 | 11.4% |
| r | 1009 | 8.4% |
| e | 883 | 7.3% |
| a | 799 | 6.6% |
| i | 651 | 5.4% |
| n | 641 | 5.3% |
| s | 638 | 5.3% |
| M | 560 | 4.6% |
| l | 525 | 4.4% |
| o | 523 | 4.3% |
| Other values (49) | 4449 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.5% |
| r | 949 | 7.9% |
| e | 855 | 7.1% |
| a | 813 | 6.8% |
| i | 662 | 5.5% |
| n | 650 | 5.4% |
| s | 641 | 5.3% |
| M | 559 | 4.7% |
| l | 539 | 4.5% |
| o | 520 | 4.3% |
| Other values (50) | 4444 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.6591928 | 4.7309417 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2078 | 2110 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
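For a categorical variable such as Sex, the length and character totals follow directly from the category counts: total characters is the sum of (label length × count), and mean length is that total divided by the number of values. The sketch below reproduces the reported figures from the male/female counts given in the Common Values tables for this variable; the variable names are illustrative only.

```python
# Category counts for Sex, copied from the Common Values tables.
counts = {
    "Dataset A": {"male": 299, "female": 147},
    "Dataset B": {"male": 283, "female": 163},
}

length_stats = {}
for name, by_label in counts.items():
    # Each label contributes len(label) characters per occurrence.
    total_chars = sum(len(label) * n for label, n in by_label.items())
    n_values = sum(by_label.values())
    length_stats[name] = {
        "total_characters": total_chars,
        "mean_length": total_chars / n_values,
    }

for name, s in length_stats.items():
    print(name, s["total_characters"], round(s["mean_length"], 7))
```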
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | male |
| 2nd row | male | male |
| 3rd row | female | male |
| 4th row | male | female |
| 5th row | male | female |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 299 | 67.0% |
| female | 147 | 33.0% |

Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 283 | 63.5% |
| female | 163 | 36.5% |
[Plots not reproduced: "Length" and "Common Values (Plot)", the latter with panels for Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| male | 299 | |
| female | 147 |
| Value | Count | Frequency (%) |
| male | 283 | |
| female | 163 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 593 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 147 | 7.1% |
| Value | Count | Frequency (%) |
| e | 609 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 163 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2078 |
| Value | Count | Frequency (%) |
| (unknown) | 2110 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 593 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 147 | 7.1% |
| Value | Count | Frequency (%) |
| e | 609 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 163 | 7.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2078 |
| Value | Count | Frequency (%) |
| (unknown) | 2110 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 593 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 147 | 7.1% |
| Value | Count | Frequency (%) |
| e | 609 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 163 | 7.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2078 |
| Value | Count | Frequency (%) |
| (unknown) | 2110 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 593 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 147 | 7.1% |
| Value | Count | Frequency (%) |
| e | 609 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 163 | 7.7% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 73 | 76 |
| Distinct (%) | 20.2% | 21.2% |
| Missing | 84 | 87 |
| Missing (%) | 18.8% | 19.5% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.606133 | 28.957047 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.75 |
| Maximum | 80 | 74 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.75 |
| 5-th percentile | 4 | 4 |
| Q1 | 21 | 20.25 |
| median | 28 | 28 |
| Q3 | 38 | 36 |
| 95-th percentile | 55.95 | 54 |
| Maximum | 80 | 74 |
| Range | 79.58 | 73.25 |
| Interquartile range (IQR) | 17 | 15.75 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.524854 | 14.014288 |
| Coefficient of variation (CV) | 0.49060287 | 0.48396813 |
| Kurtosis | 0.15947525 | 0.18745704 |
| Mean | 29.606133 | 28.957047 |
| Median Absolute Deviation (MAD) | 8 | 8 |
| Skewness | 0.38757552 | 0.38215533 |
| Sum | 10717.42 | 10395.58 |
| Variance | 210.97138 | 196.40027 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 24 | 20 | 4.5% |
| 18 | 15 | 3.4% |
| 28 | 14 | 3.1% |
| 22 | 12 | 2.7% |
| 21 | 12 | 2.7% |
| 25 | 12 | 2.7% |
| 30 | 12 | 2.7% |
| 35 | 11 | 2.5% |
| 27 | 10 | 2.2% |
| 36 | 10 | 2.2% |
| Other values (63) | 234 | |
| (Missing) | 84 | 18.8% |
| Value | Count | Frequency (%) |
| 24 | 18 | 4.0% |
| 28 | 15 | 3.4% |
| 22 | 14 | 3.1% |
| 18 | 13 | 2.9% |
| 21 | 13 | 2.9% |
| 30 | 13 | 2.9% |
| 29 | 13 | 2.9% |
| 25 | 13 | 2.9% |
| 19 | 11 | 2.5% |
| 26 | 10 | 2.2% |
| Other values (66) | 226 | |
| (Missing) | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 2 | 0.4% |
| 2 | 6 | |
| 3 | 3 | |
| 4 | 6 | |
| 5 | 2 | 0.4% |
| 7 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0.75 | 2 | 0.4% |
| 0.83 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 6 | |
| 3 | 1 | 0.2% |
| 4 | 5 | |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 7 | 3 |
| Value | Count | Frequency (%) |
| 0.75 | 2 | 0.4% |
| 0.83 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 6 | |
| 3 | 1 | 0.2% |
| 4 | 5 | |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 7 | 3 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 2 | 0.4% |
| 2 | 6 | |
| 3 | 3 | |
| 4 | 6 | |
| 5 | 2 | 0.4% |
| 7 | 2 | 0.4% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.53811659 | 0.52690583 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 306 | 293 |
| Zeros (%) | 68.6% | 65.7% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 2 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.108549 | 0.97747339 |
| Coefficient of variation (CV) | 2.0600536 | 1.8551197 |
| Kurtosis | 15.487631 | 12.147253 |
| Mean | 0.53811659 | 0.52690583 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.4129477 | 2.9624521 |
| Sum | 240 | 235 |
| Variance | 1.2288809 | 0.95545422 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 306 | |
| 1 | 95 | 21.3% |
| 2 | 20 | 4.5% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 293 | |
| 1 | 113 | 25.3% |
| 2 | 18 | 4.0% |
| 4 | 9 | 2.0% |
| 3 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 306 | |
| 1 | 95 | 21.3% |
| 2 | 20 | 4.5% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 293 | |
| 1 | 113 | 25.3% |
| 2 | 18 | 4.0% |
| 3 | 9 | 2.0% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 293 | |
| 1 | 113 | 25.3% |
| 2 | 18 | 4.0% |
| 3 | 9 | 2.0% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 306 | |
| 1 | 95 | 21.3% |
| 2 | 20 | 4.5% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 3 | 0.7% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 7 |
| Distinct (%) | 1.3% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.39910314 | 0.39237668 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 6 |
| Zeros | 337 | 339 |
| Zeros (%) | 75.6% | 76.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 6 |
| Range | 5 | 6 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.83593403 | 0.82425871 |
| Coefficient of variation (CV) | 2.0945313 | 2.1006822 |
| Kurtosis | 8.8810356 | 9.9949191 |
| Mean | 0.39910314 | 0.39237668 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.6870051 | 2.7416945 |
| Sum | 178 | 175 |
| Variance | 0.69878571 | 0.67940243 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 59 | 13.2% |
| 2 | 41 | 9.2% |
| 5 | 4 | 0.9% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 339 | |
| 1 | 55 | 12.3% |
| 2 | 45 | 10.1% |
| 3 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 59 | 13.2% |
| 2 | 41 | 9.2% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 5 | 4 | 0.9% |
| Value | Count | Frequency (%) |
| 0 | 339 | |
| 1 | 55 | 12.3% |
| 2 | 45 | 10.1% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 339 | |
| 1 | 55 | 12.3% |
| 2 | 45 | 10.1% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 59 | 13.2% |
| 2 | 41 | 9.2% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 5 | 4 | 0.9% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 379 | 384 |
| Distinct (%) | 85.0% | 86.1% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7443946 | 6.8475336 |
| Min length | 4 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3008 | 3054 |
| Distinct characters | 31 | 32 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 331 | 339 |
| Unique (%) | 74.2% | 76.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 239853 | 3101295 |
| 2nd row | 26707 | C 17369 |
| 3rd row | 392096 | STON/O 2. 3101288 |
| 4th row | 349209 | 2668 |
| 5th row | 364851 | 36866 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.5% |
| c.a | 14 | 2.5% |
| a/5 | 10 | 1.8% |
| ston/o | 7 | 1.2% |
| 2 | 7 | 1.2% |
| ca | 7 | 1.2% |
| w./c | 6 | 1.1% |
| 347082 | 5 | 0.9% |
| soton/oq | 5 | 0.9% |
| soton/o.q | 4 | 0.7% |
| Other values (396) | 472 |
| Value | Count | Frequency (%) |
| pc | 27 | 4.7% |
| c.a | 14 | 2.4% |
| a/5 | 10 | 1.7% |
| ston/o | 8 | 1.4% |
| 2 | 8 | 1.4% |
| w./c | 6 | 1.0% |
| soton/o.q | 6 | 1.0% |
| ca | 6 | 1.0% |
| sc/paris | 5 | 0.9% |
| soton/oq | 4 | 0.7% |
| Other values (403) | 480 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 370 | |
| 1 | 338 | |
| 2 | 302 | |
| 7 | 254 | |
| 4 | 222 | 7.4% |
| 6 | 218 | 7.2% |
| 0 | 199 | 6.6% |
| 5 | 186 | 6.2% |
| 8 | 154 | 5.1% |
| 9 | 154 | 5.1% |
| Other values (21) | 611 |
| Value | Count | Frequency (%) |
| 3 | 364 | |
| 1 | 347 | |
| 2 | 292 | |
| 4 | 236 | 7.7% |
| 7 | 234 | 7.7% |
| 0 | 219 | 7.2% |
| 6 | 208 | 6.8% |
| 5 | 188 | 6.2% |
| 9 | 167 | 5.5% |
| 8 | 147 | 4.8% |
| Other values (22) | 652 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3008 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 370 | |
| 1 | 338 | |
| 2 | 302 | |
| 7 | 254 | |
| 4 | 222 | 7.4% |
| 6 | 218 | 7.2% |
| 0 | 199 | 6.6% |
| 5 | 186 | 6.2% |
| 8 | 154 | 5.1% |
| 9 | 154 | 5.1% |
| Other values (21) | 611 |
| Value | Count | Frequency (%) |
| 3 | 364 | |
| 1 | 347 | |
| 2 | 292 | |
| 4 | 236 | 7.7% |
| 7 | 234 | 7.7% |
| 0 | 219 | 7.2% |
| 6 | 208 | 6.8% |
| 5 | 188 | 6.2% |
| 9 | 167 | 5.5% |
| 8 | 147 | 4.8% |
| Other values (22) | 652 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3008 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 370 | |
| 1 | 338 | |
| 2 | 302 | |
| 7 | 254 | |
| 4 | 222 | 7.4% |
| 6 | 218 | 7.2% |
| 0 | 199 | 6.6% |
| 5 | 186 | 6.2% |
| 8 | 154 | 5.1% |
| 9 | 154 | 5.1% |
| Other values (21) | 611 |
| Value | Count | Frequency (%) |
| 3 | 364 | |
| 1 | 347 | |
| 2 | 292 | |
| 4 | 236 | 7.7% |
| 7 | 234 | 7.7% |
| 0 | 219 | 7.2% |
| 6 | 208 | 6.8% |
| 5 | 188 | 6.2% |
| 9 | 167 | 5.5% |
| 8 | 147 | 4.8% |
| Other values (22) | 652 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3008 |
| Value | Count | Frequency (%) |
| (unknown) | 3054 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 370 | |
| 1 | 338 | |
| 2 | 302 | |
| 7 | 254 | |
| 4 | 222 | 7.4% |
| 6 | 218 | 7.2% |
| 0 | 199 | 6.6% |
| 5 | 186 | 6.2% |
| 8 | 154 | 5.1% |
| 9 | 154 | 5.1% |
| Other values (21) | 611 |
| Value | Count | Frequency (%) |
| 3 | 364 | |
| 1 | 347 | |
| 2 | 292 | |
| 4 | 236 | 7.7% |
| 7 | 234 | 7.7% |
| 0 | 219 | 7.2% |
| 6 | 208 | 6.8% |
| 5 | 188 | 6.2% |
| 9 | 167 | 5.5% |
| 8 | 147 | 4.8% |
| Other values (22) | 652 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 185 | 176 |
| Distinct (%) | 41.5% | 39.5% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 32.388554 | 32.95637 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 9 | 7 |
| Zeros (%) | 2.0% | 1.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.125 | 7.162525 |
| Q1 | 7.8958 | 7.925 |
| median | 13.93125 | 14.45625 |
| Q3 | 31.275 | 30.375 |
| 95-th percentile | 118.31875 | 120 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 23.3792 | 22.45 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 48.137637 | 50.150937 |
| Coefficient of variation (CV) | 1.4862546 | 1.5217373 |
| Kurtosis | 27.718145 | 24.872053 |
| Mean | 32.388554 | 32.95637 |
| Median Absolute Deviation (MAD) | 6.70205 | 6.81665 |
| Skewness | 4.2873395 | 4.1680823 |
| Sum | 14445.295 | 14698.541 |
| Variance | 2317.2321 | 2515.1165 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 8.05 | 22 | 4.9% |
| 13 | 19 | 4.3% |
| 7.8958 | 19 | 4.3% |
| 10.5 | 16 | 3.6% |
| 7.75 | 10 | 2.2% |
| 26 | 10 | 2.2% |
| 0 | 9 | 2.0% |
| 26.55 | 8 | 1.8% |
| 7.2292 | 8 | 1.8% |
| 7.925 | 8 | 1.8% |
| Other values (175) | 317 |
| Value | Count | Frequency (%) |
| 7.8958 | 24 | 5.4% |
| 26 | 22 | 4.9% |
| 8.05 | 20 | 4.5% |
| 13 | 18 | 4.0% |
| 7.75 | 15 | 3.4% |
| 10.5 | 12 | 2.7% |
| 7.925 | 12 | 2.7% |
| 7.775 | 10 | 2.2% |
| 8.6625 | 9 | 2.0% |
| 0 | 7 | 1.6% |
| Other values (166) | 297 |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 6.45 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 2 | 0.4% |
| 7.05 | 3 | 0.7% |
| 7.0542 | 1 | 0.2% |
| 7.125 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 5 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 4 | |
| 7.0542 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 5 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 4 | |
| 7.0542 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 6.45 | 1 | 0.2% |
| 6.4958 | 2 | 0.4% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 2 | 0.4% |
| 7.05 | 3 | 0.7% |
| 7.0542 | 1 | 0.2% |
| 7.125 | 3 | 0.7% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 88 | 88 |
| Distinct (%) | 85.4% | 86.3% |
| Missing | 343 | 344 |
| Missing (%) | 76.9% | 77.1% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.5436893 | 3.8529412 |
| Min length | 2 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 365 | 393 |
| Distinct characters | 18 | 19 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 76 | 77 |
| Unique (%) | 73.8% | 75.5% |
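The Cabin percentages use two different denominators, which is easy to miss. "Missing (%)" is relative to all 446 observations, whereas "Distinct (%)" and "Unique (%)" only match the reported values when divided by the number of non-missing entries. The sketch below checks that reading; it is an inference from the numbers in this report rather than a statement about the tool's documented behaviour.

```python
n_observations = 446

# Counts for Cabin, copied from the tables.
cabin = {
    "Dataset A": {"missing": 343, "distinct": 88, "unique": 76},
    "Dataset B": {"missing": 344, "distinct": 88, "unique": 77},
}

cabin_pct = {}
for name, c in cabin.items():
    non_missing = n_observations - c["missing"]
    cabin_pct[name] = {
        "non_missing": non_missing,
        # Missing share uses all observations as the denominator.
        "missing_pct": round(100 * c["missing"] / n_observations, 1),
        # Distinct and unique shares use only the non-missing entries.
        "distinct_pct": round(100 * c["distinct"] / non_missing, 1),
        "unique_pct": round(100 * c["unique"] / non_missing, 1),
    }

print(cabin_pct)
```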
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | E121 | E10 |
| 2nd row | C65 | D48 |
| 3rd row | A10 | E8 |
| 4th row | B35 | E63 |
| 5th row | C22 C26 | E40 |
| Value | Count | Frequency (%) |
| c22 | 3 | 2.5% |
| f2 | 3 | 2.5% |
| f33 | 3 | 2.5% |
| c26 | 3 | 2.5% |
| c92 | 2 | 1.7% |
| e101 | 2 | 1.7% |
| b5 | 2 | 1.7% |
| c25 | 2 | 1.7% |
| c23 | 2 | 1.7% |
| c27 | 2 | 1.7% |
| Other values (89) | 94 |
| Value | Count | Frequency (%) |
| c22 | 3 | 2.4% |
| b96 | 3 | 2.4% |
| b98 | 3 | 2.4% |
| c23 | 3 | 2.4% |
| c25 | 3 | 2.4% |
| c27 | 3 | 2.4% |
| c26 | 3 | 2.4% |
| c2 | 2 | 1.6% |
| e8 | 2 | 1.6% |
| g6 | 2 | 1.6% |
| Other values (91) | 99 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 41 | |
| 2 | 38 | |
| 1 | 34 | 9.3% |
| 3 | 32 | 8.8% |
| B | 26 | 7.1% |
| 6 | 24 | 6.6% |
| 0 | 21 | 5.8% |
| 5 | 20 | 5.5% |
| 8 | 19 | 5.2% |
| 4 | 16 | 4.4% |
| Other values (8) | 94 |
| Value | Count | Frequency (%) |
| C | 40 | |
| 2 | 40 | |
| 1 | 36 | 9.2% |
| B | 34 | 8.7% |
| 6 | 32 | 8.1% |
| 5 | 30 | 7.6% |
| (space) | 24 | 6.1% |
| 7 | 21 | 5.3% |
| 3 | 20 | 5.1% |
| D | 19 | 4.8% |
| Other values (9) | 97 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 365 |
| Value | Count | Frequency (%) |
| (unknown) | 393 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| C | 41 | |
| 2 | 38 | |
| 1 | 34 | 9.3% |
| 3 | 32 | 8.8% |
| B | 26 | 7.1% |
| 6 | 24 | 6.6% |
| 0 | 21 | 5.8% |
| 5 | 20 | 5.5% |
| 8 | 19 | 5.2% |
| 4 | 16 | 4.4% |
| Other values (8) | 94 |
| Value | Count | Frequency (%) |
| C | 40 | |
| 2 | 40 | |
| 1 | 36 | 9.2% |
| B | 34 | 8.7% |
| 6 | 32 | 8.1% |
| 5 | 30 | 7.6% |
| (space) | 24 | 6.1% |
| 7 | 21 | 5.3% |
| 3 | 20 | 5.1% |
| D | 19 | 4.8% |
| Other values (9) | 97 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 365 |
| Value | Count | Frequency (%) |
| (unknown) | 393 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| C | 41 | |
| 2 | 38 | |
| 1 | 34 | 9.3% |
| 3 | 32 | 8.8% |
| B | 26 | 7.1% |
| 6 | 24 | 6.6% |
| 0 | 21 | 5.8% |
| 5 | 20 | 5.5% |
| 8 | 19 | 5.2% |
| 4 | 16 | 4.4% |
| Other values (8) | 94 |
| Value | Count | Frequency (%) |
| C | 40 | |
| 2 | 40 | |
| 1 | 36 | 9.2% |
| B | 34 | 8.7% |
| 6 | 32 | 8.1% |
| 5 | 30 | 7.6% |
| (space) | 24 | 6.1% |
| 7 | 21 | 5.3% |
| 3 | 20 | 5.1% |
| D | 19 | 4.8% |
| Other values (9) | 97 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 365 |
| Value | Count | Frequency (%) |
| (unknown) | 393 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| C | 41 | |
| 2 | 38 | |
| 1 | 34 | 9.3% |
| 3 | 32 | 8.8% |
| B | 26 | 7.1% |
| 6 | 24 | 6.6% |
| 0 | 21 | 5.8% |
| 5 | 20 | 5.5% |
| 8 | 19 | 5.2% |
| 4 | 16 | 4.4% |
| Other values (8) | 94 |
| Value | Count | Frequency (%) |
| C | 40 | |
| 2 | 40 | |
| 1 | 36 | 9.2% |
| B | 34 | 8.7% |
| 6 | 32 | 8.1% |
| 5 | 30 | 7.6% |
| | 24 | 6.1% |
| 7 | 21 | 5.3% |
| 3 | 20 | 5.1% |
| D | 19 | 4.8% |
| Other values (9) | 97 |
Embarked
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 0 |
| Missing (%) | 0.2% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 445 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
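The character totals above can be reproduced with the Python standard library. The following is a minimal sketch, not ydata-profiling's own implementation: `summarize_characters` is a hypothetical helper, and `embarked_a` is a stand-in column rebuilt from the counts reported for Dataset A in this section (326 S, 82 C, 37 Q, 1 missing). Only the Unicode general category is computed, because `unicodedata` does not expose scripts or blocks.

```python
import unicodedata
from collections import Counter


def summarize_characters(values):
    """Count characters across all non-missing string values."""
    chars = Counter(ch for v in values if v is not None for ch in v)
    # Unicode general category of each distinct character (e.g. "Lu").
    categories = {unicodedata.category(ch) for ch in chars}
    return {
        "total_characters": sum(chars.values()),
        "distinct_characters": len(chars),
        "distinct_categories": len(categories),
    }


# Hypothetical reconstruction of Dataset A's Embarked column from the
# counts reported in this section: 326 S, 82 C, 37 Q and 1 missing value.
embarked_a = ["S"] * 326 + ["C"] * 82 + ["Q"] * 37 + [None]

summary = summarize_characters(embarked_a)
print(summary)
```

The missing value contributes no characters, which is why Dataset A reports 445 total characters against 446 observations.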
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | S | S |
| 2nd row | S | S |
| 3rd row | S | S |
| 4th row | S | C |
| 5th row | Q | Q |
Common Values
| Value | Count | Frequency (%) |
| S | 326 | |
| C | 82 | 18.4% |
| Q | 37 | 8.3% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 333 | |
| C | 76 | 17.0% |
| Q | 37 | 8.3% |
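The Frequency (%) column in the Common Values tables appears to be each count divided by the number of observations (446 in both datasets), rounded to one decimal place. The sketch below checks that reading against the reported rows; `frequency_table` is a hypothetical helper, and the denominator of 446 is an assumption taken from the dataset statistics, not from ydata-profiling's source.

```python
def frequency_table(counts, n_rows):
    """Return (value, count, percent) rows sorted by descending count.

    percent is the count relative to n_rows, rounded to one decimal.
    """
    rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(value, count, round(100 * count / n_rows, 1)) for value, count in rows]


# Counts copied from the Common Values tables for Embarked.
counts_a = {"S": 326, "C": 82, "Q": 37, "(Missing)": 1}
counts_b = {"S": 333, "C": 76, "Q": 37}

# Assumption: both datasets have 446 observations.
table_a = frequency_table(counts_a, 446)
table_b = frequency_table(counts_b, 446)

for value, count, percent in table_a:
    print(f"A  {value:<10} {count:>4} {percent:>5}%")
for value, count, percent in table_b:
    print(f"B  {value:<10} {count:>4} {percent:>5}%")
```

Under that assumption the computed percentages match the reported C, Q and (Missing) rows, and the S rows work out to 73.1% for Dataset A and 74.7% for Dataset B.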
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| s | 326 | |
| c | 82 | 18.4% |
| q | 37 | 8.3% |
| Value | Count | Frequency (%) |
| s | 333 | |
| c | 76 | 17.0% |
| q | 37 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 326 | |
| C | 82 | 18.4% |
| Q | 37 | 8.3% |
| Value | Count | Frequency (%) |
| S | 333 | |
| C | 76 | 17.0% |
| Q | 37 | 8.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | |
| C | 82 | 18.4% |
| Q | 37 | 8.3% |
| Value | Count | Frequency (%) |
| S | 333 | |
| C | 76 | 17.0% |
| Q | 37 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | |
| C | 82 | 18.4% |
| Q | 37 | 8.3% |
| Value | Count | Frequency (%) |
| S | 333 | |
| C | 76 | 17.0% |
| Q | 37 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | |
| C | 82 | 18.4% |
| Q | 37 | 8.3% |
| Value | Count | Frequency (%) |
| S | 333 | |
| C | 76 | 17.0% |
| Q | 37 | 8.3% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 277 | 278 | 0 | 2 | Parkes, Mr. Francis "Frank" | male | NaN | 0 | 0 | 239853 | 0.0000 | NaN | S |
| 236 | 237 | 0 | 2 | Hold, Mr. Stephen | male | 44.0 | 1 | 0 | 26707 | 26.0000 | NaN | S |
| 823 | 824 | 1 | 3 | Moor, Mrs. (Beila) | female | 27.0 | 0 | 1 | 392096 | 12.4750 | E121 | S |
| 514 | 515 | 0 | 3 | Coleff, Mr. Satio | male | 24.0 | 0 | 0 | 349209 | 7.4958 | NaN | S |
| 428 | 429 | 0 | 3 | Flynn, Mr. James | male | NaN | 0 | 0 | 364851 | 7.7500 | NaN | Q |
| 774 | 775 | 1 | 2 | Hocking, Mrs. Elizabeth (Eliza Needs) | female | 54.0 | 1 | 3 | 29105 | 23.0000 | NaN | S |
| 307 | 308 | 1 | 1 | Penasco y Castellana, Mrs. Victor de Satode (Maria Josefa Perez de Soto y Vallejo) | female | 17.0 | 1 | 0 | PC 17758 | 108.9000 | C65 | C |
| 624 | 625 | 0 | 3 | Bowen, Mr. David John "Dai" | male | 21.0 | 0 | 0 | 54636 | 16.1000 | NaN | S |
| 583 | 584 | 0 | 1 | Ross, Mr. John Hugo | male | 36.0 | 0 | 0 | 13049 | 40.1250 | A10 | C |
| 355 | 356 | 0 | 3 | Vanden Steen, Mr. Leo Peter | male | 28.0 | 0 | 0 | 345783 | 9.5000 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 51 | 0 | 3 | Panula, Master. Juha Niilo | male | 7.0 | 4 | 1 | 3101295 | 39.6875 | NaN | S |
| 127 | 128 | 1 | 3 | Madsen, Mr. Fridtjof Arne | male | 24.0 | 0 | 0 | C 17369 | 7.1417 | NaN | S |
| 744 | 745 | 1 | 3 | Stranden, Mr. Juho | male | 31.0 | 0 | 0 | STON/O 2. 3101288 | 7.9250 | NaN | S |
| 533 | 534 | 1 | 3 | Peter, Mrs. Catherine (Catherine Rizk) | female | NaN | 0 | 2 | 2668 | 22.3583 | NaN | C |
| 727 | 728 | 1 | 3 | Mannion, Miss. Margareth | female | NaN | 0 | 0 | 36866 | 7.7375 | NaN | Q |
| 274 | 275 | 1 | 3 | Healy, Miss. Hanora "Nora" | female | NaN | 0 | 0 | 370375 | 7.7500 | NaN | Q |
| 694 | 695 | 0 | 1 | Weir, Col. John | male | 60.0 | 0 | 0 | 113800 | 26.5500 | NaN | S |
| 471 | 472 | 0 | 3 | Cacic, Mr. Luka | male | 38.0 | 0 | 0 | 315089 | 8.6625 | NaN | S |
| 429 | 430 | 1 | 3 | Pickard, Mr. Berk (Berk Trembisky) | male | 32.0 | 0 | 0 | SOTON/O.Q. 392078 | 8.0500 | E10 | S |
| 729 | 730 | 0 | 3 | Ilmakangas, Miss. Pieta Sofia | female | 25.0 | 1 | 0 | STON/O2. 3101271 | 7.9250 | NaN | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 83 | 84 | 0 | 1 | Carrau, Mr. Francisco M | male | 28.0 | 0 | 0 | 113059 | 47.1000 | NaN | S |
| 847 | 848 | 0 | 3 | Markoff, Mr. Marin | male | 35.0 | 0 | 0 | 349213 | 7.8958 | NaN | C |
| 191 | 192 | 0 | 2 | Carbines, Mr. William | male | 19.0 | 0 | 0 | 28424 | 13.0000 | NaN | S |
| 743 | 744 | 0 | 3 | McNamee, Mr. Neal | male | 24.0 | 1 | 0 | 376566 | 16.1000 | NaN | S |
| 12 | 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20.0 | 0 | 0 | A/5. 2151 | 8.0500 | NaN | S |
| 92 | 93 | 0 | 1 | Chaffee, Mr. Herbert Fuller | male | 46.0 | 1 | 0 | W.E.P. 5734 | 61.1750 | E31 | S |
| 213 | 214 | 0 | 2 | Givard, Mr. Hans Kristensen | male | 30.0 | 0 | 0 | 250646 | 13.0000 | NaN | S |
| 304 | 305 | 0 | 3 | Williams, Mr. Howard Hugh "Harry" | male | NaN | 0 | 0 | A/5 2466 | 8.0500 | NaN | S |
| 829 | 830 | 1 | 1 | Stone, Mrs. George Nelson (Martha Evelyn) | female | 62.0 | 0 | 0 | 113572 | 80.0000 | B28 | NaN |
| 185 | 186 | 0 | 1 | Rood, Mr. Hugh Roscoe | male | NaN | 0 | 0 | 113767 | 50.0000 | A32 | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 232 | 233 | 0 | 2 | Sjostedt, Mr. Ernst Adolf | male | 59.0 | 0 | 0 | 237442 | 13.5000 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 616 | 617 | 0 | 3 | Danbom, Mr. Ernst Gilbert | male | 34.0 | 1 | 1 | 347080 | 14.4000 | NaN | S |
| 480 | 481 | 0 | 3 | Goodwin, Master. Harold Victor | male | 9.0 | 5 | 2 | CA 2144 | 46.9000 | NaN | S |
| 234 | 235 | 0 | 2 | Leyson, Mr. Robert William Norman | male | 24.0 | 0 | 0 | C.A. 29566 | 10.5000 | NaN | S |
| 93 | 94 | 0 | 3 | Dean, Mr. Bertram Frank | male | 26.0 | 1 | 2 | C.A. 2315 | 20.5750 | NaN | S |
| 830 | 831 | 1 | 3 | Yasbeck, Mrs. Antoni (Selini Alexander) | female | 15.0 | 1 | 0 | 2659 | 14.4542 | NaN | C |
| 716 | 717 | 1 | 1 | Endres, Miss. Caroline Louise | female | 38.0 | 0 | 0 | PC 17757 | 227.5250 | C45 | C |
| 585 | 586 | 1 | 1 | Taussig, Miss. Ruth | female | 18.0 | 0 | 2 | 110413 | 79.6500 | E68 | S |
| 771 | 772 | 0 | 3 | Jensen, Mr. Niels Peder | male | 48.0 | 0 | 0 | 350047 | 7.8542 | NaN | S |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.
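A duplicate-row check like the one reported here can be sketched by counting identical rows. This is an illustration only, not the library's implementation: `duplicate_rows` is a hypothetical helper, and `sample` holds the first five columns of three rows from the Dataset A sample above, truncated for brevity.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# First five columns (PassengerId, Survived, Pclass, Name, Sex) of three
# rows from the Dataset A sample; truncated for illustration.
sample = [
    (278, 0, 2, 'Parkes, Mr. Francis "Frank"', "male"),
    (237, 0, 2, "Hold, Mr. Stephen", "male"),
    (824, 1, 3, "Moor, Mrs. (Beila)", "female"),
]

no_duplicates = duplicate_rows(sample)
with_duplicate = duplicate_rows(sample + [sample[0]])
print(no_duplicates)
print(with_duplicate)
```

Because PassengerId is unique in both datasets, no two full rows can be identical, which is consistent with the empty duplicate tables.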